Finite Energy Survey Propagation for Constraint Satisfaction Problems
The Survey Propagation (SP) algorithm [1] has recently been shown to work well in the hard region for random K-SAT problems. SP has its origins in sophisticated arguments in statistical physics, and can be derived from an approach known as the cavity method, when applied at what is called the one-step replica symmetry breaking level. In its most general form, SP can be applied to general constraint satisfaction problems, and can also be used in the unsatisfiable region, where the aim is to minimize the number of violated constraints. In this paper, we formulate the SP-Y algorithm for general constraint satisfaction problems, applicable for minimizing the number of violated constraints. This could be useful, for example, in solving approximate subgraph isomorphism problems. Preliminary results show that SP can solve a few instances of induced subgraph isomorphism for which belief propagation failed to converge. Singapore-MIT Alliance (SMA
Universal Dependencies Parsing for Colloquial Singaporean English
Singlish can be interesting to the ACL community both linguistically as a
major creole based on English, and computationally for information extraction
and sentiment analysis of regional social media. We investigate dependency
parsing of Singlish by constructing a dependency treebank under the Universal
Dependencies scheme, and then training a neural network model by integrating
English syntactic knowledge into a state-of-the-art parser trained on the
Singlish treebank. Results show that English knowledge can lead to 25% relative
error reduction, resulting in a parser with 84.47% accuracy. To the best of our
knowledge, we are the first to use neural stacking to improve cross-lingual
dependency parsing on low-resource languages. We make both our annotation and
parser available for further research. Comment: Accepted by ACL 201
Activity Recognition from Physiological Data using Conditional Random Fields
We describe the application of conditional random fields (CRF) to physiological data modeling for activity recognition. We use the data provided by the Physiological Data Modeling Contest (PDMC), a Workshop at ICML 2004. Data used in PDMC are sequential in nature: they consist of physiological sessions, and each session consists of minute-by-minute sensor readings. We show that linear chain CRF can effectively make use of the sequential information in the data, and, with Expectation Maximization, can be trained on partially unlabeled sessions to improve performance. We also formulate a mixture CRF to make use of the identities of the human subjects to further improve performance. We propose that mixture CRF can be used for transfer learning, where models can be trained on data from different domains. During testing, if the domain of the test data is known, it can be used to instantiate the mixture node, and when it is unknown (or when it is a completely new domain), the marginal probabilities of the labels over all training domains can still be used effectively for prediction. Singapore-MIT Alliance (SMA
Optimizing F-measure: A Tale of Two Approaches
F-measures are popular performance metrics, particularly for tasks with
imbalanced data sets. Algorithms for learning to maximize F-measures follow two
approaches: the empirical utility maximization (EUM) approach learns a
classifier having optimal performance on training data, while the
decision-theoretic approach learns a probabilistic model and then predicts
labels with maximum expected F-measure. In this paper, we investigate the
theoretical justifications and connections for these two approaches, and we
study the conditions under which one approach is preferable to the other using
synthetic and real datasets. Given accurate models, our results suggest that
the two approaches are asymptotically equivalent given large training and test
sets. Nevertheless, empirically, the EUM approach appears to be more robust
against model misspecification, and given a good model, the decision-theoretic
approach appears to be better for handling rare classes and a common domain
adaptation scenario. Comment: ICML201
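The two approaches named in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names are hypothetical, the EUM step is reduced to picking a score threshold on training data, and the decision-theoretic step assumes independent per-example label probabilities and enumerates all labelings exactly, which is only feasible for a handful of examples.

```python
from itertools import product


def f1(y_true, y_pred):
    """F1 = 2*TP / (predicted positives + actual positives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    denom = sum(y_pred) + sum(y_true)
    if denom == 0:
        return 1.0  # assumed convention: no positives anywhere counts as perfect
    return 2.0 * tp / denom


def eum_threshold(scores, y_true):
    """EUM-style step: pick the score threshold with the best F1 on training data."""
    best_t, best_f = None, -1.0
    for t in sorted(set(scores)):
        pred = [1 if s >= t else 0 for s in scores]
        f = f1(y_true, pred)
        if f > best_f:
            best_t, best_f = t, f
    return best_t, best_f


def expected_f1(probs, y_pred):
    """Expected F1 of y_pred when label i is 1 with probability probs[i] (independent)."""
    total = 0.0
    for y in product([0, 1], repeat=len(probs)):
        weight = 1.0
        for p, yi in zip(probs, y):
            weight *= p if yi == 1 else 1.0 - p
        total += weight * f1(y, y_pred)
    return total


def decision_theoretic_predict(probs):
    """Decision-theoretic step: return the labeling with maximum expected F1."""
    best_pred, best_val = None, -1.0
    for cand in product([0, 1], repeat=len(probs)):
        val = expected_f1(probs, cand)
        if val > best_val:
            best_pred, best_val = list(cand), val
    return best_pred, best_val
```

With uncertain probabilities, the labeling returned by `decision_theoretic_predict` can differ from simply thresholding each probability at 0.5, because expected F1 does not decompose over individual examples.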
Interpretable rumor detection in microblogs by attending to user interactions
We address rumor detection by learning to differentiate between the community's response to real and fake claims in microblogs. Existing state-of-the-art models are based on tree models that model conversational trees. However, in social media, a user posting a reply might be replying to the entire thread rather than to a specific user. We propose a post-level attention model (PLAN) to model long distance interactions between tweets with the multi-head attention mechanism in a transformer network. We investigated variants of this model: (1) a structure aware self-attention model (StA-PLAN) that incorporates tree structure information in the transformer network, and (2) a hierarchical token and post-level attention model (StA-HiTPLAN) that learns a sentence representation with token-level self-attention. To the best of our knowledge, we are the first to evaluate our models on two rumor detection data sets: the PHEME data set as well as the Twitter15 and Twitter16 data sets. We show that our best models outperform current state-of-the-art models for both data sets. Moreover, the attention mechanism allows us to explain rumor detection predictions at both token-level and post-level.
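The idea of letting every post in a thread attend to every other post can be sketched with plain scaled dot-product self-attention. This is a simplified illustration, not the PLAN architecture: it uses a single head, no learned query, key, or value projections, and hypothetical post vectors; the function names are assumptions made for this example.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(posts):
    """Single-head scaled dot-product self-attention over post vectors.

    posts: list of equal-length float lists, one vector per post in a thread.
    Returns (outputs, weights); weights[i][j] is how much post i attends to post j.
    """
    dim = len(posts[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for query in posts:
        # Every post is compared with every other post, regardless of reply structure.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in posts]
        row = softmax(scores)
        weights.append(row)
        outputs.append(
            [sum(row[j] * posts[j][i] for j in range(len(posts))) for i in range(dim)]
        )
    return outputs, weights
```

The returned `weights` matrix is the kind of quantity that can be inspected to see which posts influenced each other, in the spirit of the post-level explanations the abstract mentions.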